Dataset statistics
| Number of variables | 22 |
|---|---|
| Number of observations | 84111 |
| Missing cells | 233356 |
| Missing cells (%) | 12.6% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 14.1 MiB |
| Average record size in memory | 176.0 B |
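The percentages and averages in the table above can be re-derived from the raw counts. The following is a minimal sketch in plain Python, assuming "missing cells (%)" is taken over all cells (variables × observations) and that 1 MiB is 1024 × 1024 bytes; the variable names are illustrative only, not part of the report.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 22
n_observations = 84111
missing_cells = 233356
total_memory_mib = 14.1  # already rounded to one decimal in the report

# Every variable contributes one cell per observation.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 MiB = 1024 * 1024 bytes.
avg_record_bytes = total_memory_mib * 1024 * 1024 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"average record size: about {avg_record_bytes:.0f} B")
```

Because the total size is itself rounded, the recomputed record size only agrees with the reported 176.0 B to within a byte.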
Variable types
| NUM | 21 |
|---|---|
| UNSUPPORTED | 1 |
Warnings
| reviews_count is highly correlated with labels_count | High correlation |
|---|---|
| labels_count is highly correlated with reviews_count | High correlation |
| winery_labels_count is highly correlated with winery_ratings_count | High correlation |
| winery_ratings_count is highly correlated with winery_labels_count | High correlation |
| wines_count is highly correlated with regions_count and 1 other field | High correlation |
| regions_count is highly correlated with wines_count and 1 other field | High correlation |
| wineries_count is highly correlated with regions_count and 1 other field | High correlation |
| price is highly correlated with median_price | High correlation |
| median_price is highly correlated with price | High correlation |
| median_price has 6100 (7.3%) missing values | Missing |
| median_discount has 84111 (100.0%) missing values | Missing |
| price_id has 45515 (54.1%) missing values | Missing |
| price has 45515 (54.1%) missing values | Missing |
| price_discount has 45515 (54.1%) missing values | Missing |
| maybe_this_is_wine_id has 6100 (7.3%) missing values | Missing |
| median_price is highly skewed (γ1 = 53.89933288) | Skewed |
| price is highly skewed (γ1 = 35.93065778) | Skewed |
| price_discount is highly skewed (γ1 = 22.39604389) | Skewed |
| Unnamed: 0 has unique values | Unique |
| median_discount is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
| ratings_average has 5879 (7.0%) zeros | Zeros |
| price_discount has 37553 (44.6%) zeros | Zeros |
Reproduction
| Analysis started | 2020-11-16 10:55:21.339489 |
|---|---|
| Analysis finished | 2020-11-16 10:56:43.725712 |
| Duration | 1 minute and 22.39 seconds |
| Software version | pandas-profiling v2.9.0 |
| Download configuration | config.yaml |
Unnamed: 0
Real number (ℝ≥0)
| Distinct | 84111 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 51810651.53 |
|---|---|
| Minimum | 68 |
| Maximum | 166367612 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 657.1 KiB |
Quantile statistics
| Minimum | 68 |
|---|---|
| 5-th percentile | 1243108.5 |
| Q1 | 2418234 |
| median | 10929516 |
| Q3 | 142375561.5 |
| 95-th percentile | 158085457.5 |
| Maximum | 166367612 |
| Range | 166367544 |
| Interquartile range (IQR) | 139957327.5 |
Descriptive statistics
| Standard deviation | 65046801.3 |
|---|---|
| Coefficient of variation (CV) | 1.255471595 |
| Kurtosis | -1.230741414 |
| Mean | 51810651.53 |
| Median Absolute Deviation (MAD) | 9511567 |
| Skewness | 0.802177781 |
| Sum | 4.357845711e+12 |
| Variance | 4.231086359e+15 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 150734847 | 1 | < 0.1% | |
| 8651285 | 1 | < 0.1% | |
| 4521799 | 1 | < 0.1% | |
| 150495769 | 1 | < 0.1% | |
| 3308058 | 1 | < 0.1% | |
| 152664181 | 1 | < 0.1% | |
| 1696708 | 1 | < 0.1% | |
| 3949396 | 1 | < 0.1% | |
| 1471334 | 1 | < 0.1% | |
| 3233750 | 1 | < 0.1% | |
| Other values (84101) | 84101 | > 99.9% |
| Value | Count | Frequency (%) | |
| 68 | 1 | < 0.1% | |
| 1231 | 1 | < 0.1% | |
| 1533 | 1 | < 0.1% | |
| 1813 | 1 | < 0.1% | |
| 1822 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 166367612 | 1 | < 0.1% | |
| 166329234 | 1 | < 0.1% | |
| 166318437 | 1 | < 0.1% | |
| 166288782 | 1 | < 0.1% | |
| 166195316 | 1 | < 0.1% |
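The derived figures reported for this first variable (range, IQR, coefficient of variation, variance, and the implied number of observations) follow from the primary statistics in its tables. A small sketch that checks them, using only the numbers printed above; tolerances are needed because the report rounds its values.

```python
import math

# Primary statistics copied from the tables for the first variable.
minimum, maximum = 68, 166367612
q1, q3 = 2418234, 142375561.5
mean = 51810651.53
std = 65046801.3
total = 4.357845711e12  # the "Sum" row

value_range = maximum - minimum   # Range = Maximum - Minimum
iqr = q3 - q1                     # IQR = Q3 - Q1
cv = std / mean                   # CV = standard deviation / mean
variance = std ** 2               # Variance = standard deviation squared
n_implied = total / mean          # Sum / Mean recovers the row count

assert value_range == 166367544
assert iqr == 139957327.5
assert math.isclose(cv, 1.255471595, rel_tol=1e-6)
assert math.isclose(variance, 4.231086359e15, rel_tol=1e-6)
assert math.isclose(n_implied, 84111, rel_tol=1e-6)
```

The same identities can be applied to any of the other numeric variables in the report.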
ratings_count
Real number (ℝ≥0)
| Distinct | 5899 |
|---|---|
| Distinct (%) | 7.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 787.6709229 |
|---|---|
| Minimum | 0 |
| Maximum | 122412 |
| Zeros | 6 |
| Zeros (%) | < 0.1% |
| Memory size | 657.1 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 6 |
| Q1 | 71 |
| median | 249 |
| Q3 | 704 |
| 95-th percentile | 2825 |
| Maximum | 122412 |
| Range | 122412 |
| Interquartile range (IQR) | 633 |
Descriptive statistics
| Standard deviation | 2464.052399 |
|---|---|
| Coefficient of variation (CV) | 3.128276451 |
| Kurtosis | 474.3853703 |
| Mean | 787.6709229 |
| Median Absolute Deviation (MAD) | 217 |
| Skewness | 16.48111828 |
| Sum | 66251789 |
| Variance | 6071554.226 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 1 | 1295 | 1.5% | |
| 2 | 821 | 1.0% | |
| 3 | 669 | 0.8% | |
| 4 | 563 | 0.7% | |
| 5 | 546 | 0.6% | |
| 6 | 538 | 0.6% | |
| 7 | 528 | 0.6% | |
| 9 | 458 | 0.5% | |
| 8 | 455 | 0.5% | |
| 12 | 417 | 0.5% | |
| Other values (5889) | 77821 | 92.5% |
| Value | Count | Frequency (%) | |
| 0 | 6 | < 0.1% | |
| 1 | 1295 | 1.5% | |
| 2 | 821 | 1.0% | |
| 3 | 669 | 0.8% | |
| 4 | 563 | 0.7% |
| Value | Count | Frequency (%) | |
| 122412 | 1 | < 0.1% | |
| 114706 | 1 | < 0.1% | |
| 112874 | 1 | < 0.1% | |
| 102035 | 1 | < 0.1% | |
| 101300 | 1 | < 0.1% |
ratings_average
Real number (ℝ≥0)
| Distinct | 34 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3.499179655 |
|---|---|
| Minimum | 0 |
| Maximum | 5 |
| Zeros | 5879 |
| Zeros (%) | 7.0% |
| Memory size | 657.1 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 3.5 |
| median | 3.7 |
| Q3 | 4 |
| 95-th percentile | 4.4 |
| Maximum | 5 |
| Range | 5 |
| Interquartile range (IQR) | 0.5 |
Descriptive statistics
| Standard deviation | 1.024226938 |
|---|---|
| Coefficient of variation (CV) | 0.2927048734 |
| Kurtosis | 6.62306583 |
| Mean | 3.499179655 |
| Median Absolute Deviation (MAD) | 0.3 |
| Skewness | -2.680260237 |
| Sum | 294319.5 |
| Variance | 1.04904082 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 3.8 | 9332 | 11.1% | |
| 3.7 | 8881 | 10.6% | |
| 3.6 | 7947 | 9.4% | |
| 3.9 | 7352 | 8.7% | |
| 3.5 | 6727 | 8.0% | |
| 0 | 5879 | 7.0% | |
| 4 | 5781 | 6.9% | |
| 4.1 | 5453 | 6.5% | |
| 3.4 | 4714 | 5.6% | |
| 4.2 | 4379 | 5.2% | |
| Other values (24) | 17666 | 21.0% |
| Value | Count | Frequency (%) | |
| 0 | 5879 | 7.0% | |
| 1.8 | 1 | < 0.1% | |
| 1.9 | 1 | < 0.1% | |
| 2 | 2 | < 0.1% | |
| 2.1 | 4 | < 0.1% |
| Value | Count | Frequency (%) | |
| 5 | 1 | < 0.1% | |
| 4.9 | 16 | < 0.1% | |
| 4.8 | 107 | 0.1% | |
| 4.7 | 460 | 0.5% | |
| 4.6 | 767 | 0.9% |
| Distinct | 15392 |
|---|---|
| Distinct (%) | 18.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3847.355031 |
|---|---|
| Minimum | 0 |
| Maximum | 588471 |
| Zeros | 156 |
| Zeros (%) | 0.2% |
| Memory size | 657.1 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 19 |
| Q1 | 335 |
| median | 1342 |
| Q3 | 4048 |
| 95-th percentile | 14878.5 |
| Maximum | 588471 |
| Range | 588471 |
| Interquartile range (IQR) | 3713 |
Descriptive statistics
| Standard deviation | 8782.923904 |
|---|---|
| Coefficient of variation (CV) | 2.282847263 |
| Kurtosis | 535.90951 |
| Mean | 3847.355031 |
| Median Absolute Deviation (MAD) | 1217 |
| Skewness | 13.80421095 |
| Sum | 323604879 |
| Variance | 77139752.3 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 1 | 421 | 0.5% | |
| 2 | 364 | 0.4% | |
| 3 | 331 | 0.4% | |
| 4 | 261 | 0.3% | |
| 6 | 254 | 0.3% | |
| 5 | 232 | 0.3% | |
| 8 | 216 | 0.3% | |
| 7 | 206 | 0.2% | |
| 12 | 200 | 0.2% | |
| 10 | 191 | 0.2% | |
| Other values (15382) | 81435 | 96.8% |
| Value | Count | Frequency (%) | |
| 0 | 156 | 0.2% | |
| 1 | 421 | 0.5% | |
| 2 | 364 | 0.4% | |
| 3 | 331 | 0.4% | |
| 4 | 261 | 0.3% |
| Value | Count | Frequency (%) | |
| 588471 | 1 | < 0.1% | |
| 564974 | 1 | < 0.1% | |
| 290285 | 1 | < 0.1% | |
| 247702 | 1 | < 0.1% | |
| 235198 | 1 | < 0.1% |
| Distinct | 2143 |
|---|---|
| Distinct (%) | 2.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 175.5027999 |
|---|---|
| Minimum | 0 |
| Maximum | 22669 |
| Zeros | 15 |
| Zeros (%) | < 0.1% |
| Memory size | 657.1 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 2 |
| Q1 | 19 |
| median | 69 |
| Q3 | 193 |
| 95-th percentile | 667 |
| Maximum | 22669 |
| Range | 22669 |
| Interquartile range (IQR) | 174 |
Descriptive statistics
| Standard deviation | 361.7074452 |
|---|---|
| Coefficient of variation (CV) | 2.060978204 |
| Kurtosis | 445.1783967 |
| Mean | 175.5027999 |
| Median Absolute Deviation (MAD) | 60 |
| Skewness | 12.57423682 |
| Sum | 14761716 |
| Variance | 130832.2759 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 1 | 3033 | 3.6% | |
| 2 | 1836 | 2.2% | |
| 3 | 1601 | 1.9% | |
| 4 | 1348 | 1.6% | |
| 6 | 1233 | 1.5% | |
| 5 | 1180 | 1.4% | |
| 7 | 1126 | 1.3% | |
| 8 | 1021 | 1.2% | |
| 9 | 996 | 1.2% | |
| 10 | 920 | 1.1% | |
| Other values (2133) | 69817 | 83.0% |
| Value | Count | Frequency (%) | |
| 0 | 15 | < 0.1% | |
| 1 | 3033 | 3.6% | |
| 2 | 1836 | 2.2% | |
| 3 | 1601 | 1.9% | |
| 4 | 1348 | 1.6% |
| Value | Count | Frequency (%) | |
| 22669 | 1 | < 0.1% | |
| 20842 | 1 | < 0.1% | |
| 16416 | 1 | < 0.1% | |
| 13353 | 1 | < 0.1% | |
| 10667 | 1 | < 0.1% |
wine_id
Real number (ℝ≥0)
| Distinct | 14326 |
|---|---|
| Distinct (%) | 17.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1576490.005 |
|---|---|
| Minimum | 20 |
| Maximum | 8815980 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 657.1 KiB |
Quantile statistics
| Minimum | 20 |
|---|---|
| 5-th percentile | 7073 |
| Q1 | 1105336 |
| median | 1210005 |
| Q3 | 1818912 |
| 95-th percentile | 5442993 |
| Maximum | 8815980 |
| Range | 8815960 |
| Interquartile range (IQR) | 713576 |
Descriptive statistics
| Standard deviation | 1530922.465 |
|---|---|
| Coefficient of variation (CV) | 0.9710955733 |
| Kurtosis | 3.349475964 |
| Mean | 1576490.005 |
| Median Absolute Deviation (MAD) | 550696 |
| Skewness | 1.779224708 |
| Sum | 1.326001508e+11 |
| Variance | 2.343723595e+12 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 1127795 | 32 | < 0.1% | |
| 3908 | 28 | < 0.1% | |
| 1167555 | 27 | < 0.1% | |
| 1168676 | 27 | < 0.1% | |
| 7360 | 27 | < 0.1% | |
| 1655970 | 27 | < 0.1% | |
| 9220 | 27 | < 0.1% | |
| 1166837 | 27 | < 0.1% | |
| 1684223 | 27 | < 0.1% | |
| 1152755 | 26 | < 0.1% | |
| Other values (14316) | 83836 | 99.7% |
| Value | Count | Frequency (%) | |
| 20 | 12 | < 0.1% | |
| 128 | 3 | < 0.1% | |
| 271 | 9 | < 0.1% | |
| 287 | 4 | < 0.1% | |
| 296 | 2 | < 0.1% |
| Value | Count | Frequency (%) | |
| 8815980 | 2 | < 0.1% | |
| 8636113 | 2 | < 0.1% | |
| 8544388 | 2 | < 0.1% | |
| 8479757 | 6 | < 0.1% | |
| 8479753 | 4 | < 0.1% |
winery_id
Real number (ℝ≥0)
| Distinct | 5896 |
|---|---|
| Distinct (%) | 7.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 39101.442 |
|---|---|
| Minimum | 24 |
| Maximum | 286353 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 657.1 KiB |
Quantile statistics
| Minimum | 24 |
|---|---|
| 5-th percentile | 1308 |
| Q1 | 3297 |
| median | 11265 |
| Q3 | 32084 |
| 95-th percentile | 220621 |
| Maximum | 286353 |
| Range | 286329 |
| Interquartile range (IQR) | 28787 |
Descriptive statistics
| Standard deviation | 65931.11193 |
|---|---|
| Coefficient of variation (CV) | 1.686155511 |
| Kurtosis | 3.32902247 |
| Mean | 39101.442 |
| Median Absolute Deviation (MAD) | 8981 |
| Skewness | 2.144024615 |
| Sum | 3288861388 |
| Variance | 4346911520 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 6188 | 418 | 0.5% | |
| 7831 | 361 | 0.4% | |
| 1521 | 333 | 0.4% | |
| 1308 | 304 | 0.4% | |
| 8074 | 260 | 0.3% | |
| 1941 | 244 | 0.3% | |
| 1305 | 242 | 0.3% | |
| 1432 | 235 | 0.3% | |
| 11035 | 232 | 0.3% | |
| 4009 | 231 | 0.3% | |
| Other values (5886) | 81251 | 96.6% |
| Value | Count | Frequency (%) | |
| 24 | 6 | < 0.1% | |
| 30 | 3 | < 0.1% | |
| 37 | 3 | < 0.1% | |
| 38 | 10 | < 0.1% | |
| 45 | 5 | < 0.1% |
| Value | Count | Frequency (%) | |
| 286353 | 3 | < 0.1% | |
| 286070 | 2 | < 0.1% | |
| 285862 | 7 | < 0.1% | |
| 285813 | 31 | < 0.1% | |
| 285602 | 2 | < 0.1% |
| Distinct | 8248 |
|---|---|
| Distinct (%) | 9.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 49638.02944 |
|---|---|
| Minimum | 21 |
| Maximum | 535578 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 657.1 KiB |
Quantile statistics
| Minimum | 21 |
|---|---|
| 5-th percentile | 1428 |
| Q1 | 6879.5 |
| median | 21223 |
| Q3 | 58880 |
| 95-th percentile | 195671 |
| Maximum | 535578 |
| Range | 535557 |
| Interquartile range (IQR) | 52000.5 |
Descriptive statistics
| Standard deviation | 71600.4922 |
|---|---|
| Coefficient of variation (CV) | 1.44245235 |
| Kurtosis | 9.233240121 |
| Mean | 49638.02944 |
| Median Absolute Deviation (MAD) | 17158 |
| Skewness | 2.692411408 |
| Sum | 4175104294 |
| Variance | 5126630484 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 110881 | 258 | 0.3% | |
| 143334 | 220 | 0.3% | |
| 141641 | 194 | 0.2% | |
| 20084 | 159 | 0.2% | |
| 192795 | 156 | 0.2% | |
| 535565 | 138 | 0.2% | |
| 100199 | 136 | 0.2% | |
| 178525 | 130 | 0.2% | |
| 84492 | 123 | 0.1% | |
| 29540 | 114 | 0.1% | |
| Other values (8238) | 82483 | 98.1% |
| Value | Count | Frequency (%) | |
| 21 | 2 | < 0.1% | |
| 27 | 1 | < 0.1% | |
| 36 | 2 | < 0.1% | |
| 41 | 2 | < 0.1% | |
| 45 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 535578 | 6 | < 0.1% | |
| 535574 | 14 | < 0.1% | |
| 535568 | 7 | < 0.1% | |
| 535565 | 138 | 0.2% | |
| 466146 | 4 | < 0.1% |
winery_ratings_average
Real number (ℝ≥0)
| Distinct | 24 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3.761229803 |
|---|---|
| Minimum | 2.4 |
| Maximum | 4.7 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 657.1 KiB |
Quantile statistics
| Minimum | 2.4 |
|---|---|
| 5-th percentile | 3.3 |
| Q1 | 3.6 |
| median | 3.8 |
| Q3 | 3.9 |
| 95-th percentile | 4.2 |
| Maximum | 4.7 |
| Range | 2.3 |
| Interquartile range (IQR) | 0.3 |
Descriptive statistics
| Standard deviation | 0.2785352073 |
|---|---|
| Coefficient of variation (CV) | 0.07405429125 |
| Kurtosis | 0.606037061 |
| Mean | 3.761229803 |
| Median Absolute Deviation (MAD) | 0.2 |
| Skewness | 0.06594726394 |
| Sum | 316360.8 |
| Variance | 0.07758186172 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 3.8 | 12484 | 14.8% | |
| 3.7 | 12244 | 14.6% | |
| 3.9 | 11362 | 13.5% | |
| 3.6 | 9769 | 11.6% | |
| 4 | 8351 | 9.9% | |
| 3.5 | 8074 | 9.6% | |
| 3.4 | 5652 | 6.7% | |
| 4.1 | 4769 | 5.7% | |
| 4.2 | 2854 | 3.4% | |
| 3.3 | 2703 | 3.2% | |
| Other values (14) | 5849 | 7.0% |
| Value | Count | Frequency (%) | |
| 2.4 | 8 | < 0.1% | |
| 2.5 | 17 | < 0.1% | |
| 2.6 | 40 | < 0.1% | |
| 2.7 | 31 | < 0.1% | |
| 2.8 | 63 | 0.1% |
| Value | Count | Frequency (%) | |
| 4.7 | 213 | 0.3% | |
| 4.6 | 309 | 0.4% | |
| 4.5 | 534 | 0.6% | |
| 4.4 | 650 | 0.8% | |
| 4.3 | 1896 | 2.3% |
| Distinct | 9061 |
|---|---|
| Distinct (%) | 10.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 360763.67 |
|---|---|
| Minimum | 369 |
| Maximum | 3930070 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 657.1 KiB |
Quantile statistics
| Minimum | 369 |
|---|---|
| 5-th percentile | 9972 |
| Q1 | 47881 |
| median | 141037 |
| Q3 | 392307 |
| 95-th percentile | 1530764 |
| Maximum | 3930070 |
| Range | 3929701 |
| Interquartile range (IQR) | 344426 |
Descriptive statistics
| Standard deviation | 545867.342 |
|---|---|
| Coefficient of variation (CV) | 1.513088449 |
| Kurtosis | 8.689641374 |
| Mean | 360763.67 |
| Median Absolute Deviation (MAD) | 113632 |
| Skewness | 2.711867721 |
| Sum | 3.034419305e+10 |
| Variance | 2.97971155e+11 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 813070 | 258 | 0.3% | |
| 1405362 | 220 | 0.3% | |
| 240510 | 159 | 0.2% | |
| 1312598 | 156 | 0.2% | |
| 1153318 | 136 | 0.2% | |
| 1689251 | 130 | 0.2% | |
| 490270 | 123 | 0.1% | |
| 915780 | 117 | 0.1% | |
| 130806 | 114 | 0.1% | |
| 223631 | 108 | 0.1% | |
| Other values (9051) | 82590 | 98.2% |
| Value | Count | Frequency (%) | |
| 369 | 1 | < 0.1% | |
| 403 | 3 | < 0.1% | |
| 465 | 4 | < 0.1% | |
| 474 | 4 | < 0.1% | |
| 501 | 7 | < 0.1% |
| Value | Count | Frequency (%) | |
| 3930070 | 4 | < 0.1% | |
| 3929689 | 2 | < 0.1% | |
| 3928754 | 6 | < 0.1% | |
| 3928687 | 10 | < 0.1% | |
| 3928626 | 29 | < 0.1% |
winery_wines_count
Real number (ℝ≥0)
| Distinct | 235 |
|---|---|
| Distinct (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 55.11914018 |
|---|---|
| Minimum | 1 |
| Maximum | 858 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 657.1 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 3 |
| Q1 | 11 |
| median | 31 |
| Q3 | 71 |
| 95-th percentile | 195 |
| Maximum | 858 |
| Range | 857 |
| Interquartile range (IQR) | 60 |
Descriptive statistics
| Standard deviation | 67.99940196 |
|---|---|
| Coefficient of variation (CV) | 1.233680383 |
| Kurtosis | 8.954824001 |
| Mean | 55.11914018 |
| Median Absolute Deviation (MAD) | 23 |
| Skewness | 2.512623648 |
| Sum | 4636126 |
| Variance | 4623.918667 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 5 | 2242 | 2.7% | |
| 6 | 2239 | 2.7% | |
| 3 | 2223 | 2.6% | |
| 2 | 2092 | 2.5% | |
| 7 | 1993 | 2.4% | |
| 4 | 1968 | 2.3% | |
| 8 | 1944 | 2.3% | |
| 9 | 1930 | 2.3% | |
| 10 | 1831 | 2.2% | |
| 12 | 1724 | 2.0% | |
| Other values (225) | 63925 | 76.0% |
| Value | Count | Frequency (%) | |
| 1 | 1697 | 2.0% | |
| 2 | 2092 | 2.5% | |
| 3 | 2223 | 2.6% | |
| 4 | 1968 | 2.3% | |
| 5 | 2242 | 2.7% |
| Value | Count | Frequency (%) | |
| 858 | 9 | < 0.1% | |
| 711 | 2 | < 0.1% | |
| 513 | 27 | < 0.1% | |
| 480 | 33 | < 0.1% | |
| 463 | 4 | < 0.1% |
region_id
Real number (ℝ≥0)
| Distinct | 957 |
|---|---|
| Distinct (%) | 1.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 987.6646455 |
|---|---|
| Minimum | 0 |
| Maximum | 4664 |
| Zeros | 125 |
| Zeros (%) | 0.1% |
| Memory size | 657.1 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 25 |
| Q1 | 429 |
| median | 649 |
| Q3 | 1395 |
| 95-th percentile | 3048 |
| Maximum | 4664 |
| Range | 4664 |
| Interquartile range (IQR) | 966 |
Descriptive statistics
| Standard deviation | 916.4910594 |
|---|---|
| Coefficient of variation (CV) | 0.9279374974 |
| Kurtosis | 1.69895419 |
| Mean | 987.6646455 |
| Median Absolute Deviation (MAD) | 240 |
| Skewness | 1.524355044 |
| Sum | 83073461 |
| Variance | 839955.862 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 24 | 2763 | 3.3% | |
| 492 | 2600 | 3.1% | |
| 454 | 2447 | 2.9% | |
| 718 | 1747 | 2.1% | |
| 25 | 1730 | 2.1% | |
| 593 | 1689 | 2.0% | |
| 433 | 1680 | 2.0% | |
| 827 | 1469 | 1.7% | |
| 405 | 1157 | 1.4% | |
| 394 | 1146 | 1.4% | |
| Other values (947) | 65683 | 78.1% |
| Value | Count | Frequency (%) | |
| 0 | 125 | 0.1% | |
| 7 | 601 | 0.7% | |
| 9 | 15 | < 0.1% | |
| 24 | 2763 | 3.3% | |
| 25 | 1730 | 2.1% |
| Value | Count | Frequency (%) | |
| 4664 | 7 | < 0.1% | |
| 4662 | 6 | < 0.1% | |
| 4660 | 23 | < 0.1% | |
| 4658 | 18 | < 0.1% | |
| 4656 | 5 | < 0.1% |
users_count
Real number (ℝ≥0)
| Distinct | 45 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 125 |
| Missing (%) | 0.1% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3075068.688 |
|---|---|
| Minimum | 4415 |
| Maximum | 9314284 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 657.1 KiB |
Quantile statistics
| Minimum | 4415 |
|---|---|
| 5-th percentile | 167250 |
| Q1 | 820134 |
| median | 3395333 |
| Q3 | 3975455 |
| 95-th percentile | 9314284 |
| Maximum | 9314284 |
| Range | 9309869 |
| Interquartile range (IQR) | 3155321 |
Descriptive statistics
| Standard deviation | 2755912.94 |
|---|---|
| Coefficient of variation (CV) | 0.8962118311 |
| Kurtosis | 0.611632474 |
| Mean | 3075068.688 |
| Median Absolute Deviation (MAD) | 1593254 |
| Skewness | 1.179204012 |
| Sum | 2.582627189e+11 |
| Variance | 7.595056132e+12 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 3975455 | 17727 | 21.1% | |
| 3395333 | 15273 | 18.2% | |
| 9314284 | 10635 | 12.6% | |
| 1802079 | 10529 | 12.5% | |
| 820134 | 6705 | 8.0% | |
| 255535 | 5318 | 6.3% | |
| 484656 | 4434 | 5.3% | |
| 754981 | 4282 | 5.1% | |
| 205404 | 1860 | 2.2% | |
| 1722751 | 1226 | 1.5% | |
| Other values (35) | 5997 | 7.1% |
| Value | Count | Frequency (%) | |
| 4415 | 15 | < 0.1% | |
| 5457 | 17 | < 0.1% | |
| 6803 | 27 | < 0.1% | |
| 7665 | 525 | 0.6% | |
| 8419 | 38 | < 0.1% |
| Value | Count | Frequency (%) | |
| 9314284 | 10635 | 12.6% | |
| 3975455 | 17727 | 21.1% | |
| 3518265 | 378 | 0.4% | |
| 3395333 | 15273 | 18.2% | |
| 2298654 | 4 | < 0.1% |
| Distinct | 32 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 125 |
| Missing (%) | 0.1% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 459.8737766 |
|---|---|
| Minimum | 1 |
| Maximum | 1280 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 657.1 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 35 |
| Q1 | 91 |
| median | 353 |
| Q3 | 554 |
| 95-th percentile | 1280 |
| Maximum | 1280 |
| Range | 1279 |
| Interquartile range (IQR) | 463 |
Descriptive statistics
| Standard deviation | 459.7209725 |
|---|---|
| Coefficient of variation (CV) | 0.9996677261 |
| Kurtosis | -0.6403842755 |
| Mean | 459.8737766 |
| Median Absolute Deviation (MAD) | 234 |
| Skewness | 0.9372912939 |
| Sum | 38622959 |
| Variance | 211343.3726 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 1280 | 17727 | 21.1% | |
| 554 | 15273 | 18.2% | |
| 353 | 10635 | 12.6% | |
| 149 | 10529 | 12.5% | |
| 37 | 9976 | 11.9% | |
| 91 | 6705 | 8.0% | |
| 119 | 4282 | 5.1% | |
| 110 | 1860 | 2.2% | |
| 35 | 1594 | 1.9% | |
| 210 | 1226 | 1.5% | |
| Other values (22) | 4179 | 5.0% |
| Value | Count | Frequency (%) | |
| 1 | 42 | < 0.1% | |
| 2 | 127 | 0.2% | |
| 3 | 261 | 0.3% | |
| 4 | 17 | < 0.1% | |
| 5 | 40 | < 0.1% |
| Value | Count | Frequency (%) | |
| 1280 | 17727 | 21.1% | |
| 554 | 15273 | 18.2% | |
| 353 | 10635 | 12.6% | |
| 210 | 1226 | 1.5% | |
| 149 | 10529 | 12.5% |
| Distinct | 45 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 125 |
| Missing (%) | 0.1% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 223930.4964 |
|---|---|
| Minimum | 268 |
| Maximum | 490885 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 657.1 KiB |
Quantile statistics
| Minimum | 268 |
|---|---|
| 5-th percentile | 17180 |
| Q1 | 47366 |
| median | 223615 |
| Q3 | 324753 |
| 95-th percentile | 490885 |
| Maximum | 490885 |
| Range | 490617 |
| Interquartile range (IQR) | 277387 |
Descriptive statistics
| Standard deviation | 171408.8656 |
|---|---|
| Coefficient of variation (CV) | 0.7654556589 |
| Kurtosis | -1.270679232 |
| Mean | 223930.4964 |
| Median Absolute Deviation (MAD) | 176249 |
| Skewness | 0.3974350883 |
| Sum | 1.880702667e+10 |
| Variance | 2.938099922e+10 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 490885 | 17727 | 21.1% | |
| 324753 | 15273 | 18.2% | |
| 223615 | 10635 | 12.6% | |
| 117791 | 10529 | 12.5% | |
| 45675 | 6705 | 8.0% | |
| 47366 | 5318 | 6.3% | |
| 41242 | 4434 | 5.3% | |
| 99696 | 4282 | 5.1% | |
| 34483 | 1860 | 2.2% | |
| 177443 | 1226 | 1.5% | |
| Other values (35) | 5997 | 7.1% |
| Value | Count | Frequency (%) | |
| 268 | 15 | < 0.1% | |
| 356 | 2 | < 0.1% | |
| 428 | 12 | < 0.1% | |
| 514 | 25 | < 0.1% | |
| 620 | 10 | < 0.1% |
| Value | Count | Frequency (%) | |
| 490885 | 17727 | 21.1% | |
| 324753 | 15273 | 18.2% | |
| 223615 | 10635 | 12.6% | |
| 177443 | 1226 | 1.5% | |
| 117791 | 10529 | 12.5% |
| Distinct | 44 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 125 |
| Missing (%) | 0.1% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 28291.56793 |
|---|---|
| Minimum | 39 |
| Maximum | 65178 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 657.1 KiB |
Quantile statistics
| Minimum | 39 |
|---|---|
| 5-th percentile | 1732 |
| Q1 | 5410 |
| median | 25755 |
| Q3 | 39341 |
| 95-th percentile | 65178 |
| Maximum | 65178 |
| Range | 65139 |
| Interquartile range (IQR) | 33931 |
Descriptive statistics
| Standard deviation | 22651.75846 |
|---|---|
| Coefficient of variation (CV) | 0.8006540505 |
| Kurtosis | -1.09409929 |
| Mean | 28291.56793 |
| Median Absolute Deviation (MAD) | 20345 |
| Skewness | 0.5365098139 |
| Sum | 2376095624 |
| Variance | 513102161.1 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 65178 | 17727 | 21.1% | |
| 39341 | 15273 | 18.2% | |
| 25755 | 10635 | 12.6% | |
| 16720 | 10529 | 12.5% | |
| 5410 | 6705 | 8.0% | |
| 4995 | 5318 | 6.3% | |
| 4656 | 4434 | 5.3% | |
| 12674 | 4282 | 5.1% | |
| 3821 | 1860 | 2.2% | |
| 13073 | 1226 | 1.5% | |
| Other values (34) | 5997 | 7.1% |
| Value | Count | Frequency (%) | |
| 39 | 12 | < 0.1% | |
| 46 | 15 | < 0.1% | |
| 47 | 2 | < 0.1% | |
| 101 | 17 | < 0.1% | |
| 102 | 184 | 0.2% |
| Value | Count | Frequency (%) | |
| 65178 | 17727 | 21.1% | |
| 39341 | 15273 | 18.2% | |
| 25755 | 10635 | 12.6% | |
| 16720 | 10529 | 12.5% | |
| 13073 | 1226 | 1.5% |
median_price
Real number (ℝ≥0)
| Distinct | 16572 |
|---|---|
| Distinct (%) | 21.2% |
| Missing | 6100 |
| Missing (%) | 7.3% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 323.7827782 |
|---|---|
| Minimum | 0 |
| Maximum | 287626.5 |
| Zeros | 1 |
| Zeros (%) | < 0.1% |
| Memory size | 657.1 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 32.94 |
| Q1 | 63.85 |
| median | 109 |
| Q3 | 220 |
| 95-th percentile | 764.97 |
| Maximum | 287626.5 |
| Range | 287626.5 |
| Interquartile range (IQR) | 156.15 |
Descriptive statistics
| Standard deviation | 2722.572283 |
|---|---|
| Coefficient of variation (CV) | 8.408638341 |
| Kurtosis | 3698.763906 |
| Mean | 323.7827782 |
| Median Absolute Deviation (MAD) | 58.5 |
| Skewness | 53.89933288 |
| Sum | 25258618.31 |
| Variance | 7412399.836 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 99 | 690 | 0.8% | |
| 199 | 516 | 0.6% | |
| 50 | 432 | 0.5% | |
| 149 | 419 | 0.5% | |
| 299 | 389 | 0.5% | |
| 79 | 381 | 0.5% | |
| 139 | 354 | 0.4% | |
| 129 | 340 | 0.4% | |
| 89 | 329 | 0.4% | |
| 69 | 321 | 0.4% | |
| Other values (16562) | 73840 | 87.8% | |
| (Missing) | 6100 | 7.3% |
| Value | Count | Frequency (%) | |
| 0 | 1 | < 0.1% | |
| 0.01 | 9 | < 0.1% | |
| 0.02 | 5 | < 0.1% | |
| 0.03 | 3 | < 0.1% | |
| 0.04 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 287626.5 | 1 | < 0.1% | |
| 207156.76 | 1 | < 0.1% | |
| 145000 | 16 | < 0.1% | |
| 100000 | 1 | < 0.1% | |
| 79205.85 | 1 | < 0.1% |
price_id
Real number (ℝ≥0)
| Distinct | 9368 |
|---|---|
| Distinct (%) | 24.3% |
| Missing | 45515 |
| Missing (%) | 54.1% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 17910028.66 |
|---|---|
| Minimum | 4069 |
| Maximum | 23133325 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 657.1 KiB |
Quantile statistics
| Minimum | 4069 |
|---|---|
| 5-th percentile | 4240023 |
| Q1 | 16079769.5 |
| median | 20208210 |
| Q3 | 21847988.25 |
| 95-th percentile | 22960256 |
| Maximum | 23133325 |
| Range | 23129256 |
| Interquartile range (IQR) | 5768218.75 |
Descriptive statistics
| Standard deviation | 5703746.263 |
|---|---|
| Coefficient of variation (CV) | 0.3184666184 |
| Kurtosis | 1.539190253 |
| Mean | 17910028.66 |
| Median Absolute Deviation (MAD) | 2297900 |
| Skewness | -1.534080465 |
| Sum | 6.912554662e+11 |
| Variance | 3.253272143e+13 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 17914357 | 32 | < 0.1% | |
| 20871953 | 28 | < 0.1% | |
| 22995760 | 27 | < 0.1% | |
| 20238355 | 27 | < 0.1% | |
| 20415514 | 27 | < 0.1% | |
| 21794001 | 27 | < 0.1% | |
| 21012977 | 26 | < 0.1% | |
| 17572353 | 26 | < 0.1% | |
| 19324511 | 25 | < 0.1% | |
| 19990352 | 23 | < 0.1% | |
| Other values (9358) | 38328 | 45.6% | |
| (Missing) | 45515 | 54.1% |
| Value | Count | Frequency (%) | |
| 4069 | 2 | < 0.1% | |
| 4219 | 6 | < 0.1% | |
| 29589 | 1 | < 0.1% | |
| 29631 | 6 | < 0.1% | |
| 35513 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 23133325 | 7 | < 0.1% | |
| 23132132 | 14 | < 0.1% | |
| 23131789 | 1 | < 0.1% | |
| 23131631 | 11 | < 0.1% | |
| 23131625 | 1 | < 0.1% |
price
Real number (ℝ≥0)
| Distinct | 3068 |
|---|---|
| Distinct (%) | 7.9% |
| Missing | 45515 |
| Missing (%) | 54.1% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 493.8186174 |
|---|---|
| Minimum | 15 |
| Maximum | 145000 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 657.1 KiB |
Quantile statistics
| Minimum | 15 |
|---|---|
| 5-th percentile | 48.52 |
| Q1 | 89.57 |
| median | 163.46 |
| Q3 | 333.715 |
| 95-th percentile | 1320 |
| Maximum | 145000 |
| Range | 144985 |
| Interquartile range (IQR) | 244.145 |
Descriptive statistics
| Standard deviation | 3310.11092 |
|---|---|
| Coefficient of variation (CV) | 6.703090574 |
| Kurtosis | 1516.422783 |
| Mean | 493.8186174 |
| Median Absolute Deviation (MAD) | 89.94 |
| Skewness | 35.93065778 |
| Sum | 19059423.36 |
| Variance | 10956834.3 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 99 | 609 | 0.7% | |
| 199 | 493 | 0.6% | |
| 149 | 419 | 0.5% | |
| 299 | 355 | 0.4% | |
| 129 | 318 | 0.4% | |
| 139 | 311 | 0.4% | |
| 79 | 299 | 0.4% | |
| 399 | 295 | 0.4% | |
| 179 | 268 | 0.3% | |
| 89 | 262 | 0.3% | |
| Other values (3058) | 34967 | 41.6% | |
| (Missing) | 45515 | 54.1% |
| Value | Count | Frequency (%) | |
| 15 | 8 | < 0.1% | |
| 21 | 5 | < 0.1% | |
| 24.833334 | 38 | < 0.1% | |
| 25 | 2 | < 0.1% | |
| 25.38 | 7 | < 0.1% |
| Value | Count | Frequency (%) | |
| 145000 | 16 | < 0.1% | |
| 47735.4 | 1 | < 0.1% | |
| 47134.56 | 2 | < 0.1% | |
| 41200.75 | 1 | < 0.1% | |
| 37711.03 | 5 | < 0.1% |
price_discount
Real number (ℝ≥0)
| Distinct | 139 |
|---|---|
| Distinct (%) | 0.4% |
| Missing | 45515 |
| Missing (%) | 54.1% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 8.296261115 |
|---|---|
| Minimum | 0 |
| Maximum | 4075.19 |
| Zeros | 37553 |
| Zeros (%) | 44.6% |
| Memory size | 657.1 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 0 |
| Maximum | 4075.19 |
| Range | 4075.19 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 75.02795536 |
|---|---|
| Coefficient of variation (CV) | 9.043586541 |
| Kurtosis | 854.0718843 |
| Mean | 8.296261115 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 22.39604389 |
| Sum | 320202.494 |
| Variance | 5629.194086 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 0 | 37553 | 44.6% | |
| 229 | 47 | 0.1% | |
| 129 | 40 | < 0.1% | |
| 239.95 | 32 | < 0.1% | |
| 299 | 32 | < 0.1% | |
| 169 | 31 | < 0.1% | |
| 199 | 29 | < 0.1% | |
| 169.95001 | 25 | < 0.1% | |
| 499 | 23 | < 0.1% | |
| 189.95 | 20 | < 0.1% | |
| Other values (129) | 764 | 0.9% | |
| (Missing) | 45515 | 54.1% |
| Value | Count | Frequency (%) | |
| 0 | 37553 | 44.6% | |
| 49.5 | 9 | < 0.1% | |
| 57.5 | 2 | < 0.1% | |
| 59.5 | 1 | < 0.1% | |
| 63.92 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 4075.19 | 3 | < 0.1% | |
| 1999 | 9 | < 0.1% | |
| 1650 | 1 | < 0.1% | |
| 1516.73 | 1 | < 0.1% | |
| 1398 | 1 | < 0.1% |
maybe_this_is_wine_id
Real number (ℝ≥0)
| Distinct | 48783 |
|---|---|
| Distinct (%) | 62.5% |
| Missing | 6100 |
| Missing (%) | 7.3% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 65907520.23 |
|---|---|
| Minimum | 68 |
| Maximum | 166329234 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 657.1 KiB |
Quantile statistics
| Minimum | 68 |
|---|---|
| 5-th percentile | 1346476 |
| Q1 | 3082608.5 |
| median | 18651979 |
| Q3 | 150582709 |
| 95-th percentile | 159679556 |
| Maximum | 166329234 |
| Range | 166329166 |
| Interquartile range (IQR) | 147500100.5 |
Descriptive statistics
| Standard deviation | 69253180.07 |
|---|---|
| Coefficient of variation (CV) | 1.050762945 |
| Kurtosis | -1.750855357 |
| Mean | 65907520.23 |
| Median Absolute Deviation (MAD) | 17228971 |
| Skewness | 0.3777916018 |
| Sum | 5.141511561e+12 |
| Variance | 4.79600295e+15 |
| Monotonicity | Not monotonic |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 1330584 | 32 | < 0.1% |
| 1599557 | 28 | < 0.1% |
| 154899645 | 27 | < 0.1% |
| 1332169 | 27 | < 0.1% |
| 1387528 | 27 | < 0.1% |
| 2657826 | 27 | < 0.1% |
| 4039840 | 26 | < 0.1% |
| 4194936 | 26 | < 0.1% |
| 135218685 | 25 | < 0.1% |
| 121906870 | 23 | < 0.1% |
| Other values (48773) | 77743 | 92.4% |
| (Missing) | 6100 | 7.3% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 68 | 1 | < 0.1% |
| 1231 | 1 | < 0.1% |
| 2079 | 1 | < 0.1% |
| 2083 | 1 | < 0.1% |
| 2201 | 1 | < 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 166329234 | 1 | < 0.1% |
| 166318437 | 1 | < 0.1% |
| 166034013 | 1 | < 0.1% |
| 165921332 | 1 | < 0.1% |
| 165842803 | 1 | < 0.1% |
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, which implies that for a linear relationship the slope of the line (its angle to the x-axis) does not affect r. To calculate r for two variables X and Y, divide the covariance of X and Y by the product of their standard deviations.
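As a minimal sketch of the calculation described above, the following standard-library code divides the sample covariance by the product of the sample standard deviations. The function name `pearson_r` and the input values are illustrative assumptions and are not taken from the profiled dataset. The second call shifts and rescales one variable (with a positive scale factor) to show that r does not change.

```python
import math

def pearson_r(xs, ys):
    """Sample covariance of xs and ys divided by the product of their sample standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    return cov / (std_x * std_y)

# Illustrative values only, not taken from the profiled dataset.
x = [10.0, 20.0, 30.0, 40.0, 55.0]
y = [12.0, 18.0, 33.0, 41.0, 50.0]

r = pearson_r(x, y)
# Shifting and rescaling y (positive scale factor) leaves r unchanged.
r_rescaled = pearson_r(x, [3.0 * v + 100.0 for v in y])

print(round(r, 4), round(r_rescaled, 4))
```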
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation. To calculate ρ for two variables X and Y, divide the covariance of the rank variables of X and Y by the product of the standard deviations of those rank variables.
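The sketch below illustrates that definition: it converts each variable to ranks (ties receive the average of the ranks they span) and then applies the Pearson formula to the ranks. The helper names and the example data are illustrative assumptions. The example uses a cubic relationship, which is monotonic but not linear, so ρ reaches 1 while Pearson's r stays below 1.

```python
import math

def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson_r(xs, ys):
    """Covariance divided by the product of standard deviations."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)  # the 1/(n-1) factors cancel

def spearman_rho(xs, ys):
    """Pearson correlation of the rank variables."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

# Illustrative data: y = x**3 is monotonic but not linear.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]

rho = spearman_rho(x, y)
r = pearson_r(x, y)
print(round(rho, 4), round(r, 4))
```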
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation. To calculate τ for two variables X and Y, determine the number of concordant and discordant pairs of observations. τ is then the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
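The pair-counting formula above can be sketched directly. This variant, usually called tau-a, divides by the total number of pairs exactly as described; statistical libraries often default to a tie-adjusted variant (tau-b), so their results can differ when the data contain ties. The function name and the example values are illustrative assumptions.

```python
from itertools import combinations

def kendall_tau_a(xs, ys):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = 0
    discordant = 0
    pairs = list(combinations(zip(xs, ys), 2))
    for (x1, y1), (x2, y2) in pairs:
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1   # both variables move in the same direction
        elif sign < 0:
            discordant += 1   # the variables move in opposite directions
        # sign == 0 means a tie in x or y; it counts toward neither group
    return (concordant - discordant) / len(pairs)

# Illustrative values: one swapped pair out of six.
tau = kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])
print(round(tau, 4))
```

With four observations there are six pairs; five are concordant and one is discordant, so τ = (5 - 1) / 6.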
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation on φk is available.
First rows
| | Unnamed: 0 | ratings_count | ratings_average | labels_count | reviews_count | wine_id | winery_id | winery_ratings_count | winery_ratings_average | winery_labels_count | winery_wines_count | region_id | users_count | regions_count | wines_count | wineries_count | median_price | median_discount | price_id | price | price_discount | maybe_this_is_wine_id |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 22505105 | 481 | 3.7 | 4992 | 138 | 1577018 | 6802 | 26413 | 3.5 | 334302 | 90 | 593 | 255535.0 | 37.0 | 47366.0 | 4995.0 | 44.31 | NaN | NaN | NaN | NaN | 22505105.0 |
| 1 | 5625084 | 1358 | 4.4 | 6279 | 407 | 1153456 | 14299 | 119738 | 4.0 | 457778 | 47 | 454 | 484656.0 | 37.0 | 41242.0 | 4656.0 | 248.89 | NaN | NaN | NaN | NaN | 5625084.0 |
| 2 | 2117077 | 1118 | 4.2 | 3841 | 336 | 1153456 | 14299 | 119738 | 4.0 | 457778 | 47 | 454 | 484656.0 | 37.0 | 41242.0 | 4656.0 | 206.73 | NaN | NaN | NaN | NaN | 2117077.0 |
| 3 | 2223038 | 1675 | 4.4 | 6678 | 532 | 1153456 | 14299 | 119738 | 4.0 | 457778 | 47 | 454 | 484656.0 | 37.0 | 41242.0 | 4656.0 | 289.69 | NaN | NaN | NaN | NaN | 2223038.0 |
| 4 | 2386146 | 1607 | 4.2 | 5377 | 585 | 1153456 | 14299 | 119738 | 4.0 | 457778 | 47 | 454 | 484656.0 | 37.0 | 41242.0 | 4656.0 | 325.00 | NaN | NaN | NaN | NaN | 2386146.0 |
| 5 | 22124179 | 726 | 4.4 | 3174 | 236 | 1153456 | 14299 | 119738 | 4.0 | 457778 | 47 | 454 | 484656.0 | 37.0 | 41242.0 | 4656.0 | 255.98 | NaN | NaN | NaN | NaN | 22124179.0 |
| 6 | 97550707 | 1304 | 4.4 | 6765 | 420 | 1153456 | 14299 | 119738 | 4.0 | 457778 | 47 | 454 | 484656.0 | 37.0 | 41242.0 | 4656.0 | 240.65 | NaN | NaN | NaN | NaN | 97550707.0 |
| 7 | 1673488 | 1157 | 4.1 | 3807 | 383 | 1153456 | 14299 | 119738 | 4.0 | 457778 | 47 | 454 | 484656.0 | 37.0 | 41242.0 | 4656.0 | 262.06 | NaN | NaN | NaN | NaN | 1673488.0 |
| 8 | 14523104 | 450 | 3.4 | 3993 | 140 | 1199675 | 15240 | 110330 | 3.7 | 834953 | 166 | 798 | 820134.0 | 91.0 | 45675.0 | 5410.0 | 30.45 | NaN | NaN | NaN | NaN | 14523104.0 |
| 9 | 4417694 | 537 | 3.3 | 3485 | 143 | 1199675 | 15240 | 110330 | 3.7 | 834953 | 166 | 798 | 820134.0 | 91.0 | 45675.0 | 5410.0 | 27.44 | NaN | NaN | NaN | NaN | 4417694.0 |
Last rows
| | Unnamed: 0 | ratings_count | ratings_average | labels_count | reviews_count | wine_id | winery_id | winery_ratings_count | winery_ratings_average | winery_labels_count | winery_wines_count | region_id | users_count | regions_count | wines_count | wineries_count | median_price | median_discount | price_id | price | price_discount | maybe_this_is_wine_id |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 84101 | 152284886 | 1082 | 3.8 | 12088 | 429 | 4298429 | 259899 | 5080 | 3.8 | 57326 | 4 | 3106 | 3395333.0 | 554.0 | 324753.0 | 39341.0 | 44.68 | NaN | NaN | NaN | NaN | 152284886.0 |
| 84102 | 141205035 | 3 | 0.0 | 11 | 3 | 4298429 | 259899 | 5080 | 3.8 | 57326 | 4 | 3106 | 3395333.0 | 554.0 | 324753.0 | 39341.0 | 74.22 | NaN | NaN | NaN | NaN | 141205035.0 |
| 84103 | 144389143 | 1033 | 3.7 | 11788 | 402 | 4298429 | 259899 | 5080 | 3.8 | 57326 | 4 | 3106 | 3395333.0 | 554.0 | 324753.0 | 39341.0 | 44.74 | NaN | NaN | NaN | NaN | 144389143.0 |
| 84104 | 156670729 | 440 | 3.9 | 5130 | 168 | 4298429 | 259899 | 5080 | 3.8 | 57326 | 4 | 3106 | 3395333.0 | 554.0 | 324753.0 | 39341.0 | 52.36 | NaN | NaN | NaN | NaN | 156670729.0 |
| 84105 | 5047574 | 685 | 4.0 | 8037 | 234 | 1183738 | 1495 | 14350 | 4.2 | 135640 | 2 | 384 | 3975455.0 | 1280.0 | 490885.0 | 65178.0 | 575.00 | NaN | 17918103.0 | 575.00 | 0.0 | 5047574.0 |
| 84106 | 57244403 | 304 | 4.2 | 5058 | 105 | 1183738 | 1495 | 14350 | 4.2 | 135640 | 2 | 384 | 3975455.0 | 1280.0 | 490885.0 | 65178.0 | 575.00 | NaN | 17918103.0 | 575.00 | 0.0 | 5047574.0 |
| 84107 | 26879557 | 1914 | 3.5 | 10735 | 643 | 1821619 | 223696 | 7937 | 3.5 | 47593 | 4 | 834 | 820134.0 | 91.0 | 45675.0 | 5410.0 | 105.54 | NaN | 22838173.0 | 105.54 | 0.0 | 159860353.0 |
| 84108 | 91412830 | 1078 | 3.7 | 7541 | 357 | 1821619 | 223696 | 7937 | 3.5 | 47593 | 4 | 834 | 820134.0 | 91.0 | 45675.0 | 5410.0 | 105.54 | NaN | 22838173.0 | 105.54 | 0.0 | 159860353.0 |
| 84109 | 3176625 | 439 | 3.4 | 1627 | 169 | 1821619 | 223696 | 7937 | 3.5 | 47593 | 4 | 834 | 820134.0 | 91.0 | 45675.0 | 5410.0 | 105.54 | NaN | 22838173.0 | 105.54 | 0.0 | 159860353.0 |
| 84110 | 151005131 | 848 | 3.7 | 6820 | 300 | 1821619 | 223696 | 7937 | 3.5 | 47593 | 4 | 834 | 820134.0 | 91.0 | 45675.0 | 5410.0 | 105.54 | NaN | 22838173.0 | 105.54 | 0.0 | 159860353.0 |